-
-
Notifications
You must be signed in to change notification settings - Fork 394
fix!: consider non-files as pruned officially. #1730
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Previously, these were misclassified as `File`, which can lead to blocking applications which get stuck reading pipes. Now the downstream is forced to deal with the possibility that the item at hand isn't a file, to do application-specific things.
…hem when possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good and I've verified that it fixes #1729 in gix clean, skipping over the files changed to pipes with panicking, as well as showing just the typechange in gix status output. To summarize:
$ rm README.md
$ mkfifo README.md
$ cargo run --bin=gix -- status
Compiling gitoxide v0.39.0 (/home/ek/repos/gitoxide)
Finished `dev` profile [unoptimized + debuginfo] target(s) in 6.57s
Running `target/debug/gix status`
08:51:43 status done 2.3K files in 0.02s (116.6K files/s)
T README.md
head -> index isn't implemented yet
$ cargo run --bin=gix -- clean -n
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.51s
Running `target/debug/gix clean -n`
Nothing to clean (Skipped 1 expendable entry - show with -x)
I wonder if there is a clearer name than NonFile. I haven't thought of one, though. Also, it seems to me that it may be feasible to commit generated archives that contain named pipes. I've commented below with my thinking about those things.
| /// Something that is not a file, like a named pipe or character device. | ||
| /// | ||
| /// These can only exist in the filesystem. | ||
| NonFile, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what the best thing is to call filesystem entries of kinds that are not representable in Git repositories. In my experience, usually "file" is either construed broadly to include such entries well as directories and symbolic links, or construed narrowly to mean "regular file" and thus exclude directories and symbolic links.
I think it is unintuitive for regular files, directories, and symbolic links all not to be NonFiles, at the same time as FIFOs, sockets, character and block devices, and other more exotic kinds of filesystem entries are NonFiles (though at least one such kind of entry, a whiteout, is very intuitively a non-file).
I don't know if it's possible to name NonFile better--which is why I'm saying this instead of opening a PR to rename it. The documentation comment can probably be expanded to clarify fully what NonFile is everything other than, as well as to list the other common examples of NonFiles, and I do plan to open a PR for that. It may be that documentation in other places can be usefully expanded too; I'll look into that.
This would not be the first technical usage of "non-" with a non-literal meaning. For example, in git commit -am message, only commit is said to be a "non-option argument," even though the operand message is unambiguously an argument that is not an option argument. Sometimes this is hard to avoid, or maybe not worth avoiding.
However, this does seem to be an area where confusion is possible, or at least possible for me. For example, when I read the description and commit message in #1629, which quoted from a comment in an early version of Git a few weeks before Git began to track symbolic links, I took "non-regular files" to encompass symbolic links and was worried that the code there might wrongly remove support for them.
In fact, the code there deliberately preserved support for them (as did #1727, which also used that phrase). But the same version of the comment that Git developers used early on to describe a situation that encompassed the absence of symlink support was taken to encompass the presence of symlink support when cited.
Another possible source of confusion is that some kinds of NonFiles are distinguished from related entities by being files. For example, a FIFO (named pipe) differs from an anonymous pipe by being a file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking the time to make this point of ambiguity so clear.
I'd definitely welcome a PR which changes the variant name and/or improves the documentation - right now it's definitely very brief.
Other also comes to mind as catch-all for everything that isn't the other variants. Technically that's a good fit as the logic literally does it like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've opened #1734 for this. For now, I have not attempted to rename NonFile.
| status_unchanged.tar | ||
| status_changed.tar | ||
| symlink_stack.tar | ||
| status_nonfile.tar |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could generated archives be committed for this?
I think all tests using fixtures this covers are run only only on Unix-like systems (so the Windows differences discussed in #1727 (review) would not be a problem).
With the tar command, a FIFO can be included in a .tar archive and is recreated when that archive is extracted. I'm not sure about other approaches to taring and untaring, but gix-testtools uses the tar crate. The tar crate seems to support FIFOs, which are listed here, whereas kinds of filesystem entries that don't tar, like sockets, are omitted from that enum.
(A test of the tar crate with FIFOs is run only on Linux, but I understand the comment there to mean that this is solely due to a CI limitation and not to any decreased ability to archive and extract them on other Unix-like systems.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interestingly, I ran into this locally and on MacOS it definitely didn't properly retain the pipe status of the non-files, or it couldn't recreate it. They ended up as normal files after extracting them from the tar archive.
Since the script doesn't do much, I think it's also fine to not have it cached.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a commit was pushed, that might still be accessible, that has the .tar file, then I could look at it to see if the FIFO didn't end up in it, or if the issue was extraction. (But if not, or if it's not convenient to find, then I could still possibly reproduce this, next time I have access to a macOS system.)
|
That's great to hear, thanks for checking!
I was wondering the same thing, but also believe to remember that name in Git somewhere. |
Discussed in: GitoxideLabs#1730 (comment) At least for now, they remain called `NonFile`s (and sometimes referred to as "non-files" in text), but more specifically defined.
I didn't find that, or not yet. I have a bit more detail in the #1734 description, though that PR does not change the name. |
|
It's true, I also couldn't find that |
Fixes #1729.
Tasks
NonFile+PrunedNoneFile+ default status handlingI think the learning is that
NonFileis a file type downstream has to deal with officially, just pruning these leads to unexpected results.Git will officially say that it won't diff or read or add non-files, and to do that, such entries must not be pruned, but they must be part of the normal entry handling that should always or most of the time reject non-files.